Creating Large Scale Database Servers

Authors

  • Jacek Becla
  • Andrew Hanushevsky

Abstract

The BaBar experiment at the Stanford Linear Accelerator Center (SLAC) is designed to perform a high-precision investigation of the decays of the B-meson produced from electron-positron interactions. The experiment, started in May 1999, will generate approximately 300 TB/year of data for 10 years. All of the data will reside in Objectivity databases accessible via the Advanced Multi-threaded Server (AMS). To date, over 70 TB of data have been placed in Objectivity/DB, making it one of the largest databases in the world. Providing access to such a large quantity of data through a database server is a daunting task. A full-scale testbed environment had to be developed to tune various software parameters, and a fundamental change had to occur in the AMS architecture to allow it to scale past several hundred terabytes of data. Additionally, several protocol extensions had to be implemented to provide practical access to large quantities of data. This paper describes the design of the database and the changes we needed to make in the AMS for scalability reasons. It also describes how the lessons we learned would apply to virtually any kind of database server seeking to operate in the petabyte range.

Similar resources

FINDIT: A Server Based Approach to Finding Information in Large Scale Heterogeneous Databases

Finding information in large scale autonomous heterogeneous databases is an issue that has been virtually unexplored in database research. We describe a new approach for finding information in a large scale network of autonomous and heterogeneous databases. In our scheme, no a priori knowledge of the database schemas or the space of information is necessary to locate information. The system relie...

Separating indexes from data: a distributed scheme for secure database outsourcing

Database outsourcing is an idea to eliminate the burden of database management from organizations. Since data is a critical asset of organizations, its privacy must be preserved against outside adversaries and untrusted servers. In this paper, we present a distributed scheme based on storing shares of data on different servers and separating indexes from data on a distinct server. Shamir...

SLEDs: Storage Latency Estimation Descriptors

Managing the latency of storage systems is a key to creating effective very large scale information systems, such as web interfaces to satellite image databases and video-on-demand servers. Storage Latency Estimation Descriptors (SLEDs) are architecture-independent descriptions of the retrieval time of a unit of data. They describe the latency to the first byte, and the bandwidth expected. SLED...

A Scalable Self-organizing Map Algorithm for Textual Classification: A Neural Network Approach to Thesaurus Generation

The rapid proliferation of textual and multimedia online databases, digital libraries, Internet servers, and intranet services has turned researchers' and practitioners' dream of creating an information-rich society into a nightmare of info-gluts. Many researchers believe that turning an info-glut into a useful digital library requires automated techniques for organizing and categorizing large-...

Publication date: 2000